TOP 50 Cryptocurrencies Historical Prices
Nikulin Maxim DSBA 243-2
Introduction: The cryptocurrency market has evolved dramatically in recent years, with the total market capitalization reaching $3.46 trillion as of 2024. This analysis examines the historical price movements and market trends of the top 50 cryptocurrencies, providing insight into the development and dynamics of the digital asset ecosystem.
The dataset is available at the following link: https://www.kaggle.com/datasets/odins0n/top-50-cryptocurrency-historical-prices?resource=download
Dataset description:
- Date: Date of observation
- Price: Price on the given day (also the closing price for that day)
- Open: Opening price on the given day
- High: Highest price on the given day
- Low: Lowest price on the given day
- Vol.: Volume of transactions on the given day
- Change %: Percentage change from the previous day
Main part:
Importing the main libraries for the project:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.io as pio
import plotly.graph_objects as go
These are the first ten rows of the dataset:
df = pd.read_csv(r'C:\Users\m4xni\Desktop\Проекты\HSE_project\Nikulin_243_2\Aave.csv')
print(df.head(10))
   SNo        Date  Price  Open  High   Low       Vol.  Change %
0    1  2018-01-30   0.15  0.17  0.17  0.14   530470.0     -7.95
1    2  2018-01-31   0.14  0.15  0.15  0.13   396050.0    -11.10
2    3  2018-02-01   0.11  0.14  0.14  0.11   987260.0    -17.46
3    4  2018-02-02   0.10  0.11  0.11  0.08  1810000.0     -8.32
4    5  2018-02-03   0.11  0.10  0.12  0.09  1200000.0      6.85
5    6  2018-02-04   0.09  0.11  0.12  0.09  1040000.0    -18.16
6    7  2018-02-05   0.07  0.09  0.09  0.06   756000.0    -24.39
7    8  2018-02-06   0.09  0.07  0.09  0.05   819460.0     26.28
8    9  2018-02-07   0.08  0.09  0.09  0.07   890850.0    -10.06
9   10  2018-02-08   0.09  0.08  0.09  0.08   211470.0     15.81
1. All columns except Date are numeric, so let's find:
- Median values,
- Average values,
- Standard deviations of the fields.
print("\033[1mMedian values:\033[0m")
print((df[['Price', 'Open', 'High', 'Low', 'Vol.', 'Change %']].median()).round(2))
print()
print("\033[1mMean values:\033[0m")
print((df[['Price', 'Open', 'High', 'Low', 'Vol.', 'Change %']].mean()).round(2))
print()
print("\033[1mStandard deviation values:\033[0m")
print((df[['Price', 'Open', 'High', 'Low', 'Vol.', 'Change %']].std()).round(2))
print()
Median values:
Price            0.03
Open             0.03
High             0.03
Low              0.03
Vol.        310590.00
Change %         0.00
dtype: float64

Mean values:
Price           67.05
Open            66.74
High            71.13
Low             62.49
Vol.        674127.40
Change %         5.45
dtype: float64

Standard deviation values:
Price           139.96
Open            139.63
High            148.90
Low             130.72
Vol.        1077260.55
Change %        176.11
dtype: float64
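The three statistics above can also be collected in a single table with `DataFrame.agg`. The sketch below is only an illustration: it assumes a small hypothetical frame named `sample`, built from the first five rows printed earlier, in place of the full `df`.

```python
import pandas as pd

# Hypothetical sample: the first five rows of the dataset printed earlier
sample = pd.DataFrame({
    'Price': [0.15, 0.14, 0.11, 0.10, 0.11],
    'Open': [0.17, 0.15, 0.14, 0.11, 0.10],
    'High': [0.17, 0.15, 0.14, 0.11, 0.12],
    'Low': [0.14, 0.13, 0.11, 0.08, 0.09],
    'Vol.': [530470.0, 396050.0, 987260.0, 1810000.0, 1200000.0],
    'Change %': [-7.95, -11.10, -17.46, -8.32, 6.85],
})

# One row per statistic, one column per field
summary = sample.agg(['median', 'mean', 'std']).round(2)
print(summary)
```

On the full data, replacing `sample` with the same column selection from `df` should give the numbers printed above in one table instead of three.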
2. Let's check the data for missing values using the summary statistics:
print(df.describe())
               SNo        Price         Open         High          Low  \
count  1275.000000  1275.000000  1275.000000  1275.000000  1275.000000
mean    638.000000    67.045906    66.742322    71.129875    62.490110
std     368.205106   139.960408   139.634444   148.895685   130.723039
min       1.000000     0.000000     0.000000     0.000000     0.000000
25%     319.500000     0.010000     0.010000     0.010000     0.010000
50%     638.000000     0.030000     0.030000     0.030000     0.030000
75%     956.500000     0.580000     0.580000     0.620000     0.540000
max    1275.000000   629.380000   629.380000   665.180000   564.850000

               Vol.     Change %
count  1.275000e+03  1275.000000
mean   6.741274e+05     5.454431
std    1.077261e+06   176.107560
min    0.000000e+00   -38.080000
25%    5.339500e+04     0.000000
50%    3.105900e+05     0.000000
75%    8.440000e+05     0.000000
max    1.050000e+07  6284.530000
df = df.drop_duplicates()
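Every numeric column has a count of 1275, which matches the number of rows, so these columns contain no missing values. Duplicate rows are dropped as a precaution, and the data is clean enough for analysis.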
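`describe()` only reports the non-null count of numeric columns, so a more direct check of types, missing values and duplicates may be useful. The sketch below assumes a hypothetical three-row frame named `sample` with one NaN injected on purpose, so that the check has something to find; it is not taken from the real file.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one deliberately missing volume value
sample = pd.DataFrame({
    'Date': ['2018-01-30', '2018-01-31', '2018-02-01'],
    'Price': [0.15, 0.14, 0.11],
    'Vol.': [530470.0, np.nan, 987260.0],
})
sample['Date'] = pd.to_datetime(sample['Date'])

missing_per_column = sample.isna().sum()    # NaN count for every column
duplicate_rows = sample.duplicated().sum()  # number of fully identical rows

print(sample.dtypes)
print(missing_per_column)
print(f"Duplicate rows: {duplicate_rows}")
```

Running the same three calls on `df` would show the type of the Date column as well, which `describe()` leaves out.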
3. Now we can create some graphs based on our dataset.
To keep the charts readable, I use only the first 50 rows.
df['Date'] = pd.to_datetime(df['Date'])
df_50 = df.head(50).copy()
df_50['Timestamp'] = df_50['Date'].map(pd.Timestamp.timestamp)
x_50 = df_50['Timestamp']
y_50 = df_50['Price']
coefficients_50 = np.polyfit(x_50, y_50, 1)
trend_line_50 = np.poly1d(coefficients_50)
fig = px.line(
    df_50,
    x='Date',
    y='Price',
    labels={'Date': 'Date', 'Price': 'Price (USD)'}
)
fig.add_scatter(
    x=df_50['Date'],
    y=trend_line_50(x_50),
    mode='lines',
    name='Trend Line',
    line=dict(color='red', dash='dash')
)
fig.add_scatter(
    x=df_50['Date'],
    y=df_50['High'],
    mode='markers',
    name='High',
    marker=dict(color='green', size=8)
)
fig.add_scatter(
    x=df_50['Date'],
    y=df_50['Low'],
    mode='markers',
    name='Low',
    marker=dict(color='red', size=8)
)
fig.update_layout(
    xaxis_title='Date',
    yaxis_title='Price (USD)',
    template='plotly_white',
    title_x=0.5,
    width=1000,
    height=600
)
fig.show()
The chart shows that the price of the cryptocurrency fell from 2018-02-01 to 2018-03-22. The dashed red line is the linear trend fitted to these 50 points. The scatter markers show the highest price (green) and the lowest price (red) on each date.
Now let's create a bar chart:
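The direction of the trend can also be read from the slope of the fitted line. The sketch below is a simplified variant of the fit used for the chart: it assumes a hypothetical frame `sample` with the first five rows printed earlier, and it fits against days elapsed rather than Unix timestamps so that the slope reads directly as USD per day.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the first five rows of the dataset printed earlier
sample = pd.DataFrame({
    'Date': pd.to_datetime(['2018-01-30', '2018-01-31', '2018-02-01',
                            '2018-02-02', '2018-02-03']),
    'Price': [0.15, 0.14, 0.11, 0.10, 0.11],
})

# Days elapsed since the first observation (0, 1, 2, ...)
days = (sample['Date'] - sample['Date'].iloc[0]).dt.days

# Degree-1 least-squares fit: Price ~ slope * days + intercept
slope, intercept = np.polyfit(days, sample['Price'], 1)

direction = 'falling' if slope < 0 else 'rising or flat'
print(f"Slope: {slope:.4f} USD per day ({direction})")
```

A negative slope corresponds to a falling trend line on the chart; the same calculation on `df_50` would cover the full 50-day window.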
df['Date'] = pd.to_datetime(df['Date'])
df_50 = df.head(50).copy()
fig = px.bar(
    df_50,
    x='Date',
    y=['Price', 'High', 'Low'],
    labels={'Date': 'Date', 'value': 'Price (USD)', 'variable': 'Metrics'},
    barmode='group'
)
fig.update_layout(
    title='Bar chart',
    xaxis_title='Date',
    yaxis_title='Price (USD)',
    template='plotly_white',
    title_x=0.5,
    width=1000,
    height=600
)
fig.show()
The grouped bars show, for each date, how far the closing price sits from that day's High and Low, which makes it easy to tell stable days from days with large fluctuations. For any given date, the High bar is always at or above Price, and the Low bar is always at or below Price.
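The ordering Low <= Price <= High can be verified in code rather than by eye. The sketch below assumes a hypothetical frame `sample` with the first five rows printed earlier; the same check could be run on `df`.

```python
import pandas as pd

# Hypothetical sample: the first five rows of the dataset printed earlier
sample = pd.DataFrame({
    'Price': [0.15, 0.14, 0.11, 0.10, 0.11],
    'High': [0.17, 0.15, 0.14, 0.11, 0.12],
    'Low': [0.14, 0.13, 0.11, 0.08, 0.09],
})

# True for rows where Low <= Price <= High (bounds are inclusive)
within_range = sample['Price'].between(sample['Low'], sample['High'])

# Rows that break the expected ordering (empty if the data is consistent)
violations = sample[~within_range]
print(f"{len(violations)} of {len(sample)} rows fall outside the daily range")
```

Any rows left in `violations` would point to inconsistent records that deserve a closer look before plotting.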
df['Date'] = pd.to_datetime(df['Date'])
df_50 = df.head(50).copy()
fig = go.Figure()
fig.add_trace(go.Candlestick(
    x=df_50['Date'],                         # Dates for the x-axis
    open=df_50['Change %'],                  # Change as opening values
    high=df_50['Change %'] + df_50['Vol.'],  # Change + Vol. as high
    low=df_50['Change %'] - df_50['Vol.'],   # Change - Vol. as low
    close=df_50['Change %'],                 # Change as closing values
    increasing_line_color='green',           # Color for increasing candles
    decreasing_line_color='red',             # Color for decreasing candles
    name='Candlestick (Vol. & Change)'
))
fig.update_layout(
    title='Candlestick chart',
    xaxis_title='Date',
    yaxis_title='Change & Volume',
    template='plotly_white',
    title_x=0.5,
    width=1000,
    height=600
)
fig.show()
X-axis (dates): The horizontal axis shows the dates of the first 50 data points in chronological order, so changes and volumes can be tracked over time.
Y-axis (Change & Volume): The vertical axis combines Change % with Vol. The position of each candlestick is set by Change %, which can be positive or negative, and its vertical extent is set by Vol.
Candlestick components:
Open & Close (Change %): Both are set to the same Change % value, so the body of each candlestick collapses to a flat line at that level.
High & Low (Vol.): The wicks (lines above and below the body) are calculated from Vol.: High = Change % + Vol. and Low = Change % - Vol. Because Vol. is several orders of magnitude larger than Change % (hundreds of thousands or more versus tens of percent in the rows printed above), the wick length is effectively twice the trading volume.
Colors: Green is assigned to increasing candlesticks and red to decreasing ones.
Insights: The chart is mainly a visual representation of trading volume centered on the daily Change %. Long candlesticks mark days with high volume, and short candlesticks mark quieter days.
df['Date'] = pd.to_datetime(df['Date'])
df_50 = df.head(50).copy()
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=df_50['Date'],
    y=df_50['High'],
    mode='lines',
    line=dict(color='green', width=1),
    name='High'
))
fig.add_trace(go.Scatter(
    x=df_50['Date'],
    y=df_50['Low'],
    mode='lines',
    line=dict(color='red', width=1),
    name='Low'
))
fig.add_trace(go.Scatter(
    x=df_50['Date'],
    y=df_50['Open'],
    mode='markers',
    marker=dict(color='blue', size=6),
    name='Open'
))
fig.update_layout(
    title='OHLC Chart',
    xaxis_title='Date',
    yaxis_title='Price (USD)',
    template='plotly_white',
    title_x=0.5,
    width=1000,
    height=600
)
fig.show()
The green line represents the highest price achieved during each time interval, while the red line shows the lowest price. The blue markers indicate the starting prices, providing a point of reference for each interval's price movements.
Periods with larger gaps between the green and red lines indicate high volatility, reflecting significant price fluctuations during those intervals. Conversely, narrower gaps suggest stability with minimal price movement. The position of a blue marker shows where the day opened within that range: a marker close to the green line means the price opened near its high and had little room to rise further, while a marker close to the red line means it opened near its low.
This visualization captures short-term price behavior, showing trends of increase or decrease over time. Steep divergences between high and low values may indicate impactful market events or increased trading activity, while closely aligned lines suggest quieter market conditions. The chart serves as a concise tool for identifying trends and assessing market volatility within the selected period.
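The gap between the High and Low lines can be expressed as a number per day, which makes volatile days easier to rank. The sketch below assumes a hypothetical frame `sample` with the first five rows printed earlier; the column names `Range` and `Range %` are introduced here for illustration and are not part of the dataset.

```python
import pandas as pd

# Hypothetical sample: the first five rows of the dataset printed earlier
sample = pd.DataFrame({
    'Date': pd.to_datetime(['2018-01-30', '2018-01-31', '2018-02-01',
                            '2018-02-02', '2018-02-03']),
    'Price': [0.15, 0.14, 0.11, 0.10, 0.11],
    'High': [0.17, 0.15, 0.14, 0.11, 0.12],
    'Low': [0.14, 0.13, 0.11, 0.08, 0.09],
})

# Absolute daily range in USD (gap between the green and red lines)
sample['Range'] = sample['High'] - sample['Low']

# Range relative to the closing price, in percent
sample['Range %'] = (sample['Range'] / sample['Price'] * 100).round(2)

# Day with the widest relative gap between High and Low
widest = sample.loc[sample['Range %'].idxmax()]
print(sample[['Date', 'Range', 'Range %']])
print(f"Widest relative range on {widest['Date'].date()}")
```

Using the relative range instead of the absolute one keeps days comparable when the price level itself changes a lot over the period.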
x_col = 'High'
y_col = 'Low'
z_col = 'Change %'
data_50 = df.head(50)
# Group by to aggregate duplicates (if any)
data_grouped = data_50.groupby([y_col, x_col], as_index=False)[z_col].mean()
heatmap_data = data_grouped.pivot(index=y_col, columns=x_col, values=z_col)
fig = px.imshow(
    heatmap_data,
    color_continuous_scale="RdBu",
    zmin=heatmap_data.min().min(),
    zmax=heatmap_data.max().max(),
    title="Heatmap",
    labels={"color": "Intensity"}
)
fig.update_layout(
    width=1200,
    height=800,
)
fig.show()